Abstract: Comparing complete animal mitochondrial genome sequences is becoming increasingly common as a model for 
genome evolution and phylogenetic reconstruction. In the present work, we compare the complete mitochondrial genome 
sequences of five species of cetaceans and artiodactyls and infer phylogenetic relationships among them. The mitochondrial genome of 
each taxon contains the 37 genes found in a typical mammalian mitochondrial genome, and its general structure is highly conserved among species. 
Phylogenetic trees constructed using maximum parsimony (MP), maximum likelihood (ML), and Bayesian inference (BI) methods show a similar topology and indicate the paraphyly of 
Artiodactyla, due to the sister-group relationship between the Cetacea and the Hippopotamidae. The study confirms that 
mitogenomics is a useful tool for research on mammal phylogenetics, but recognizes that increased taxon sampling is still 
required to resolve existing differences between nuclear and mitochondrial gene trees. 
1. Introduction 

The sequence and structure of mitochondrial (mt) 
genomes are widely used to provide information on 
comparative and evolutionary genomics, molecular 
evolution, gene flow patterns, phylogenetics, and 
population genetics [1, 2]. Several recent analyses have 
demonstrated that analyses of complete mt genomes provide 
higher levels of support than those based on individual or 
partial mt genes [3, 4, 5]. The mammalian mitochondrial 
genome is a closed-circular, double-stranded DNA molecule 
about 16.5 kb in length that typically encodes only 13 
proteins, two ribosomal RNAs, and 22 transfer RNAs [6]. 

Cetartiodactyla is one of the most diversified orders of 
mammals, with 332 extant species grouped into 132 genera. 
The members of this taxonomic group were originally 
divided into two different orders, Artiodactyla and Cetacea 
[7], but recent literature suggests a close relationship 
between these two orders based on a host of paleontological 
[8], morphological [9], and molecular [10, 11, 12] studies. 
Molecular data analyses, in particular, indicate that the 
Cetacea are the sister group of the Hippopotamidae [10, 13], 
thus suggesting that all species of Artiodactyla and Cetacea 
belong to a single order, called Cetartiodactyla [14]. This 
order includes all species of Cetacea, Hippopotamidae, 
Antilocapridae, Bovidae, Cervidae, Giraffidae, Moschidae, 
Tragulidae, Suidae, Tayassuidae, and Camelidae. 

This study compared complete mitochondrial genomes 
from species of Artiodactyla and Cetacea in order to gain a 
better understanding of the molecular evolution of the 
mitochondrial genome and infer phylogenetic relationships 
among these groups. 

2. Materials and Methods 

Complete mitochondrial genomes were downloaded from 
GenBank and the heavy-strand encoded protein-coding genes 
were aligned according to [15]. The analyses based on 
complete mitochondrial genomes included five species: two 
cetaceans (Balaenoptera musculus, GenBank number 
X72204; Caperea marginata, GenBank number AP006475) 
and three artiodactyls (Hippopotamus amphibius, GenBank 
number AP003425; Moschus chrysogaster, GenBank 
number NC_020093; Tayassu tajacu, GenBank number 
AP003427). 
2.1. Comparative Analyses 

The protein-coding genes of each of the five mitochondrial 
genomes were compared. The initial and final codons were 
predicted for the 13 protein-coding genes by comparing the 
amino acid sequences. Ribosomal RNA (rRNA) and transfer 
RNA (tRNA) genes were also identified by comparison with 
those of other complete mitochondrial genomes from 
cetartiodactyls. All comparisons were performed using CLC 
Genomics Workbench software version 8.0.1 (CLC bio). 

2.2. Phylogenetic Analyses 

The complete mitochondrial genomes were aligned using 
BioEdit [16], with subsequent visual confirmation. All 
those regions that showed ambiguity due to the position of 
the gaps were excluded from the analyses to avoid 
erroneous hypotheses of primary homology. Thus, the 
control region was not included in the alignment because of 
the presence of many ambiguous gaps. The final alignment 
consisted of 13,813 nucleotide sites (nt). 

Phylogenetic reconstructions were obtained under 
Maximum Parsimony (MP) and Maximum Likelihood (ML) 
criteria with PAUP* 4.0b10 [17]. Under MP, trees were 
obtained by treating gaps as a fifth state, using a 
heuristic search with an initial tree obtained via stepwise 
addition (random input order) of the taxa, followed by 
complete tree-bisection-reconnection (TBR) 
branch-swapping. The robustness of the topology was then 
assessed through bootstrapping [18] with 10,000 replicates. The 
most appropriate model of nucleotide evolution was 
selected using ModelTest 3.7 [19]. A heuristic ML search 
was then conducted using this model. Again, the robustness 
of the topology was evaluated using bootstrap resampling 
[18] after 10,000 replicates. 
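
The bootstrap step described above can be illustrated with a minimal sketch. It is not the PAUP* implementation: it only shows how one pseudo-replicate is built by resampling alignment columns with replacement. The function name `bootstrap_alignment` and the toy sequences are hypothetical and are not data from this study.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Build one bootstrap pseudo-replicate by resampling columns with replacement."""
    n_sites = len(next(iter(alignment.values())))
    # Draw as many column indices as there are sites, with replacement.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        taxon: "".join(seq[i] for i in columns)
        for taxon, seq in alignment.items()
    }

# Toy alignment (hypothetical sequences, for illustration only).
toy = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTTCGTAC",
    "taxon_C": "ACGAACGTTC",
}
replicate = bootstrap_alignment(toy, random.Random(1))
for taxon, seq in replicate.items():
    print(taxon, seq)
```

In a full analysis, a tree would be inferred from each pseudo-replicate, and the support for a node would be the percentage of replicate trees that contain it.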

Bayesian inference (BI) was also performed 
using MrBayes 3.1.2 [20]. The best-fit model of sequence 
evolution (GTR+I+G) was selected with ModelTest 3.7 [19] 
under the Akaike Information Criterion. The analysis was 
run with the following settings: Nst = 6, rates = gamma, 
Ngen = 6,000,000, sampling frequency = 100, chains = 5. 
After discarding the first 20,000 trees as “burn-in,” we 
constructed a majority-rule consensus tree with Bayesian 
posterior probabilities for each node. 
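
As a quick check on these settings, the short calculation below works out how many trees are sampled and what share of them the burn-in represents. It assumes the counts refer to a single run and ignores the starting tree; both are assumptions made here for illustration, not details given in the text.

```python
# Settings as stated in the text.
ngen = 6_000_000        # number of generations
sample_freq = 100       # one tree sampled every 100 generations
burn_in_trees = 20_000  # trees discarded as burn-in

# Assumption: counts are per run and the starting tree is ignored.
sampled_trees = ngen // sample_freq
retained_trees = sampled_trees - burn_in_trees
burn_in_fraction = burn_in_trees / sampled_trees

print(sampled_trees, retained_trees, round(burn_in_fraction, 3))
# prints: 60000 40000 0.333
```

Under these assumptions, the burn-in corresponds to the first third of the sampled trees, and the consensus is built from the remaining 40,000.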

In accordance with [12], trees were rooted using 
sequences from Equus asinus (GenBank number 
NC_001788) and Equus caballus (GenBank number 
NC_001640). 

3. Results and Discussion 

3.1. Characteristics of the Mitochondrial Genome 

The five mitochondrial genomes are circular, 
double-stranded DNA molecules that range from 16,353 to 
16,836 nucleotides in length and contain 37 genes: 
22 tRNA genes, two rRNA genes (rrnS and rrnL), and 13 
protein-coding genes (COX1-3, ND1-6, ND4L, 
cytochrome b, ATP6, and ATP8), plus a non-coding region 
(Table 1 and Figure 1). Of these genes, eight tRNAs and ND6 
are encoded by the L-strand, while all others are encoded by 
the H-strand. The gene arrangement is the same in all the 
cetacean and artiodactyl species examined and is similar to 
that of other mammalian genomes, such as those of the bats 
Artibeus jamaicensis [21] and Rhinolophus monoceros [22], 
which, according to [23], are closely related 
phylogenetically to cetaceans and artiodactyls. 

All protein-coding genes of the mtDNA of the five 
species have a methionine start codon (ATR), except ND3 
in H. amphibius (ATT) and ND5 in M. chrysogaster (ATT). 
Also, all present TTA in ND6 and, with the exception of 
M. chrysogaster, all have GTG in ND4L (Table 1). Most 
protein-coding genes appear to be terminated by TAR, though 
this stop codon is incomplete in various genes, such as 
ND2, COI, ATP8, ND4L, and ND5. 
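
The codon checks described above can be sketched as follows. The codon sets are those of NCBI translation table 2 (the vertebrate mitochondrial code). The function names and the toy sequences are hypothetical, and treating a trailing T or TA as an incomplete stop codon is a simplifying assumption for illustration.

```python
# NCBI translation table 2 (vertebrate mitochondrial code).
START_CODONS = {"ATT", "ATC", "ATA", "ATG", "GTG"}
STOP_CODONS = {"TAA", "TAG", "AGA", "AGG"}

def is_start(codon):
    """True if the codon is an initiation codon in the vertebrate mt code."""
    return codon.upper() in START_CODONS

def classify_stop(cds):
    """Classify the 3' end of a coding sequence as 'complete', 'incomplete', or 'none'."""
    remainder = len(cds) % 3
    if remainder == 0:
        return "complete" if cds[-3:].upper() in STOP_CODONS else "none"
    # Assumption: a trailing T or TA is an incomplete stop codon,
    # completed to TAA after transcription.
    tail = cds[-remainder:].upper()
    return "incomplete" if tail in {"T", "TA"} else "none"

print(is_start("GTG"))             # True
print(classify_stop("ATGAAATAA"))  # complete
print(classify_stop("ATGAAAT"))    # incomplete
```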

Overlaps between adjacent mt genes in the species 
examined in this study ranged from 1 to 45 base pairs (bp) 
(Table 1), as in other cetaceans, such as Orcinus orca [24], 
and artiodactyls, such as Bison bison [25]. 
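
Gene overlaps of this kind follow directly from the coordinates in Table 1. The sketch below shows the arithmetic for two gene pairs in Balaenoptera musculus; the helper name `overlap_bp` is hypothetical, and coordinates are taken to be 1-based and inclusive.

```python
def overlap_bp(gene_a, gene_b):
    """Shared positions between two genes given as (start, end), 1-based inclusive."""
    start = max(gene_a[0], gene_b[0])
    end = min(gene_a[1], gene_b[1])
    return max(0, end - start + 1)

# Coordinates for Balaenoptera musculus as listed in Table 1.
atp8, atp6 = (8225, 8416), (8386, 9066)
nd4l, nd4 = (10336, 10632), (10626, 12003)

print(overlap_bp(atp8, atp6))  # 31
print(overlap_bp(nd4l, nd4))   # 7
```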

The overall base composition of the mt genome shows no 
significant differences between the cetacean (B. musculus 
and C. marginata) and artiodactyl species (H. amphibius, 
M. chrysogaster, and T. tajacu; Figure 2). In fact, the 
greatest differences were found among the artiodactyl 
species. The nucleotide percentages (cetaceans vs. 
artiodactyls, in the species order given above) were: 
C (27.6, 26.6 vs. 28.6, 25.0, 27.0%), 
G (13.0, 13.0 vs. 14.0, 12.9, 13.5%), 
A (32.8, 32.9 vs. 32.7, 34.0, 34.2%), and 
T (26.6, 27.5 vs. 24.6, 28.1, 25.3%). 
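
Percentages like these can be obtained with a simple count over the sequence. The sketch below is an illustration only: the function name `base_composition` and the toy sequence are hypothetical, and ambiguous bases are simply excluded from the total.

```python
from collections import Counter

def base_composition(seq):
    """Percentage of A, C, G, and T, ignoring any other symbols."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    return {base: round(100 * counts[base] / total, 1) for base in "ACGT"}

toy = "AACCGTTAAT"  # hypothetical sequence, not from this study
print(base_composition(toy))
# prints: {'A': 40.0, 'C': 20.0, 'G': 10.0, 'T': 30.0}

# A+T content from the percentages reported above for B. musculus.
print(round(32.8 + 26.6, 1))  # 59.4
```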

Sequence analyses revealed the presence of two 
ribosomal RNA (rRNA) genes: 12S rRNA and 16S rRNA. 
The size and arrangement of these genes agree with those 
of other cetaceans and artiodactyls. The sizes of the 12S 
rRNA and 16S rRNA genes were similar in all species, 
ranging from 951 to 974 bp and from 1,565 to 1,576 bp, 
respectively. The 12S rRNA gene is located between 
tRNA-Phe and tRNA-Val, while the 16S rRNA gene lies 
between tRNA-Val and tRNA-Leu (UUR). 

Non-coding regions were also identified, including the 
origin of light-strand replication (OL) and the control 
region, which is important for the replication and 
maintenance of the mt genome [26]. The OL region was 
identified between tRNA-Asn and tRNA-Cys and varied 
from 32 bp to 37 bp, similar to that of bat species such as 
Corynorhinus rafinesquii, Lasiurus borealis, and Artibeus 
lituratus [27]. The length of the control region, 
meanwhile, varied slightly in the five species, from 916 to 
959 bp. This region exhibited higher AT than GC 
content, as in other metazoans [28]. The highest 
GC contents were 41.2% for H. amphibius and 38.6% 
for B. musculus. 
Table 1. Characteristics of the mitochondrial genome in two species of cetaceans and three species of artiodactyls. Locations are given as start-end positions, with sizes in bp in parentheses. Letters in the codon columns refer to the species: a, Balaenoptera musculus and b, Caperea marginata (Cetacea); c, Hippopotamus amphibius, d, Moschus chrysogaster, and e, Tayassu tajacu (Artiodactyla). 

Name of gene | B. musculus (a) | C. marginata (b) | H. amphibius (c) | M. chrysogaster (d) | T. tajacu (e) | Strand | Start codon | Stop codon
--- | --- | --- | --- | --- | --- | --- | --- | ---
tRNA-Phe | 422-494 (73) | 1-73 (73) | 1-71 (71) | 1-68 (68) | 1-70 (70) | H | |
12S ribosomal RNA (12S) | 496-1467 (972) | 76-1049 (974) | 72-1039 (968) | 69-1023 (955) | 70-1020 (951) | H | |
tRNA-Val | 1468-1534 (67) | 1050-1116 (67) | 1040-1108 (69) | 1024-1090 (67) | 1020-1087 (68) | H | |
16S ribosomal RNA (16S) | 1535-3109 (1575) | 1117-2692 (1576) | 1109-2673 (1565) | 1091-2659 (1569) | 1088-2661 (1574) | H | |
tRNA-Leu (UUR) | 3110-3184 (75) | 2693-2767 (75) | 2674-2747 (74) | 2662-2736 (75) | 2661-2735 (75) | H | |
NADH dehydrogenase subunit 1 (ND1) | 3187-4143 (957) | 2770-3726 (957) | 2750-3706 (957) | 2740-3695 (956) | 2738-3694 (957) | H | ATG | TAA (a,b,c,d); TAG (e)
tRNA-Ile | 4143-4211 (69) | 3726-3794 (69) | 3706-3774 (69) | 3696-3764 (69) | 3693-3762 (70) | H | |
tRNA-Gln | 4209-4281 (73) | 3792-3864 (73) | 3772-3843 (72) | 3762-3833 (72) | 3760-3832 (73) | L | |
tRNA-Met | 4283-4351 (69) | 3866-3934 (69) | 3845-3913 (69) | 3836-3904 (69) | 3833-3902 (70) | H | |
NADH dehydrogenase subunit 2 (ND2) | 4352-5395 (1044) | 3935-4978 (1044) | 3914-4957 (1044) | 3905-4946 (1042) | 3903-4946 (1044) | H | ATA | TAG (a,b,c,e); TAA (d)
tRNA-Trp | 5394-5462 (69) | 4977-5044 (68) | 4956-5022 (67) | 4947-5013 (67) | 4945-5012 (68) | H | |
tRNA-Ala | 5468-5536 (69) | 5050-5117 (68) | 5027-5095 (69) | 5015-5083 (69) | 5018-5085 (68) | L | |
tRNA-Asn | 5538-5611 (74) | 5119-5192 (74) | 5098-5171 (74) | 5085-5157 (73) | 5094-5167 (74) | L | |
Origin of L-strand replication (OL) | 5611-5647 (37) | 5192-5228 (37) | 5172-5203 (32) | 5158-5190 (33) | 5166-5199 (33) | | |
tRNA-Cys | 5644-5711 (68) | 5225-5292 (68) | 5204-5271 (68) | 5191-5256 (66) | 5200-5266 (67) | L | |
tRNA-Tyr | 5712-5777 (66) | 5293-5358 (66) | 5272-5337 (66) | 5257-5324 (68) | 5266-5333 (68) | L | |
Cytochrome c oxidase subunit I (COI) | 5779-7329 (1551) | 5360-6910 (1551) | 5339-6895 (1557) | 5326-6870 (1545) | 5335-6879 (1545) | H | ATG | AGA (a,b,c); TAA (d); TAG (e)
tRNA-Ser (UCN) | 7325-7395 (71) | 6906-6976 (71) | 6889-6959 (71) | 6868-6936 (69) | 6883-6951 (69) | L | |
tRNA-Asp | 7401-7468 (68) | 6982-7049 (67) | 6965-7031 (67) | 6944-7011 (68) | 6958-7026 (69) | H | |
Cytochrome c oxidase subunit II (COII) | 7469-8152 (684) | 7050-7733 (684) | 7032-7715 (684) | 7013-7696 (684) | 7027-7722 (696) | H | ATG | TAA (a,b,d); TAG, AGA (c,e)
tRNA-Lys | 8156-8223 (68) | 7737-7804 (68) | 7719-7782 (64) | 7700-7767 (68) | 7715-7781 (67) | H | |
ATP synthase F0 subunit 8 (ATP8) | 8225-8416 (192) | 7806-7997 (192) | 7784-7990 (207) | 7769-7969 (201) | 7783-7986 (204) | H | ATG | TAA (a,b,d,e); TAG (c)
ATP synthase F0 subunit 6 (ATP6) | 8386-9066 (681) | 7967-8647 (681) | 7945-8625 (681) | 7930-8610 (681) | 7944-8624 (681) | H | ATG | TAA
Cytochrome c oxidase subunit III (COIII) | 9066-9851 (786) | 8647-9432 (786) | 8625-9408 (784) | 8610-9393 (784) | 8624-9407 (784) | H | ATG | TAG (a,b,c,e); TAA (d)
tRNA-Gly | 9851-9919 (69) | 9432-9500 (69) | 9410-9479 (70) | 9394-9462 (69) | 9408-9475 (68) | H | |
NADH dehydrogenase subunit 3 (ND3) | 9920-10266 (347) | 9501-9847 (347) | 9480-9826 (347) | 9463-9809 (347) | 9477-9823 (347) | H | ATA (a,b,d,e); ATT (c) | TAA
tRNA-Arg | 10267-10335 (69) | 9848-9916 (69) | 9827-9895 (69) | 9810-9878 (69) | 9824-9891 (68) | H | |
NADH dehydrogenase subunit 4L (ND4L) | 10336-10632 (297) | 9917-10213 (297) | 9896-10192 (297) | 9879-10175 (297) | 9892-10188 (297) | H | GTG (a,b,c,e); ATG (d) | TAA
NADH dehydrogenase subunit 4 (ND4) | 10626-12003 (1378) | 10207-11584 (1378) | 10186-11563 (1378) | 10169-11546 (1378) | 10182-11559 (1378) | H | ATG | TAA
tRNA-His | 12004-12072 (69) | 11585-11653 (69) | 11564-11633 (70) | 11547-11616 (70) | 11560-11627 (68) | H | |
tRNA-Ser (AGY) | 12073-12133 (61) | 11654-11714 (61) | 11634-11692 (59) | 11617-11676 (60) | 11628-11686 (59) | H | |
tRNA-Leu (CUN) | 12135-12204 (70) | 11716-11785 (70) | 11694-11763 (70) | 11678-11747 (70) | 11687-11756 (70) | H | |
NADH dehydrogenase subunit 5 (ND5) | 12205-14025 (1821) | 11786-13606 (1821) | 11764-13584 (1821) | 11748-13568 (1821) | 11757-13577 (1821) | H | ATA (a,b,c,e); ATT (d) | TAA
NADH dehydrogenase subunit 6 (ND6) | 14009-14536 (528) | 13590-14117 (528) | 13568-14098 (531) | 13552-14079 (528) | 13561-14088 (528) | L | ATG | TAA
tRNA-Glu | 14537-14605 (69) | 14118-14186 (69) | 14096-14164 (69) | 14080-14148 (69) | 14089-14157 (69) | L | |
Cytochrome b (Cytb) | 14610-15749 (1140) | 14191-15330 (1140) | 14169-15308 (1140) | 14153-15292 (1140) | 14162-15301 (1140) | H | ATG | AGA
tRNA-Thr | 15750-15821 (72) | 15331-15402 (72) | 15309-15379 (71) | 15296-15365 (70) | 15302-15371 (70) | H | |
tRNA-Pro | 15821-15887 (67) | 15402-15468 (67) | 15378-15443 (66) | 15365-15430 (66) | 15371-15435 (65) | L | |
D-loop | 15888-16402, 1-421 (935) | 15469-16384 (916) | 15444-16402 (959) | 15431-16353 (923) | - | | |


Figure 1. Comparative mitogenomics between cetacean and artiodactyl species. Blue arrows show the genes for the 13 essential respiratory chain subunits: ND1-ND6 
(NADH dehydrogenase subunits 1-6) and ND4L (NADH dehydrogenase subunit 4L), CYT B (cytochrome b), COX1-COX3 (cytochrome c oxidase subunits 
1-3), and ATPase6 and ATPase8. Also shown are the two ribosomal RNA genes (12S rRNA and 16S rRNA) (red arrows) and the 22 transfer RNA 
genes (red arrows). Transfer RNA genes are Phe, Val, Leu, Ile, Gln, Met, Trp, Ala, Asn, Cys, Tyr, Ser, Asp, Lys, Gly, Arg, His, Glu, Thr, and Pro. 
[Figure 2: phylogenetic tree. Node support values legible in the figure: 100/100, 90/100, 100/100.] 

Figure 2. Phylogenetic relationships among the species of cetaceans and artiodactyls 
based on 13,813 nucleotides using MP, ML, and BI analyses. Numbers below the 
line indicate the posterior probabilities of the nodes in the Bayesian 
inference analysis; numbers above the line are the bootstrap values of the 
nodes in the ML or ML/MP analyses, after 1,000 replicates. Color online: 
overall base composition of the mitochondrial genome of the species of 
cetaceans and artiodactyls. 

3.2. Phylogeny 

Maximum Parsimony, Maximum Likelihood, and 
Bayesian phylogenetic analyses based on an alignment of 
13,813 bp resulted in a robust tree (Figure 2) with high 
bootstrap values (100%) supporting the interspecific 
relationships. The results indicate the paraphyly of 
Artiodactyla due to the sister-group relationship between 
Cetacea and Hippopotamidae. Following [12], these two taxa 
together form the group Whippomorpha. Moreover, the results 
are in full agreement with recent works that suggest a close 
relationship between Artiodactyla and Cetacea based on a 
host of paleontological [8], morphological [9], and molecular 
[10, 11, 12] studies. To render taxonomy compatible with 
this phylogeny, Montgelard et al. [14] proposed placing all 
species of Artiodactyla and Cetacea in the same order, 
called Cetartiodactyla. 

Genetic distances based on the evolutionary model 
selected by ModelTest were lower among the species of 
cetaceans (12.1%) than among the artiodactyls 
(22.5-23.3%). The latter values were similar to those seen 
in the between-taxa comparisons of the two orders 
(22.0-24.8%). These results provide a reasonable basis of 
support for mitogenomic relationships within 
Cetartiodactyla. Although anatomists strongly defended the 
monophyly of Artiodactyla for 150 years, a double-pulley 
astragalus and a paraxonic foot have been identified in all 
extant terrestrial cetartiodactyls and in Eocene whales, 
which indicates that these traits were present in the 
common ancestor of Cetartiodactyla [29]. 
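
The genetic distances reported above were computed under the model selected by ModelTest. As a simpler illustration of what a pairwise distance measures, the sketch below computes the uncorrected p-distance, which is a different and cruder measure than the model-based distances used in this study. The function name and the toy sequences are hypothetical.

```python
def p_distance(seq1, seq2):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differences += 1
    return differences / compared

# Toy aligned sequences (hypothetical, for illustration only).
print(p_distance("ACGTACGTAC", "ACGTTCGTAA"))  # 0.2
```

Model-based distances such as those reported here correct the observed proportion for multiple substitutions at the same site, so they are generally larger than the p-distance for the same pair of sequences.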

4. Conclusions 

This study presents the first comparison of entire 
mitochondrial genomes for various cetaceans and 
artiodactyls. It supports the notion that mitogenomics is a 
powerful tool for investigating genome evolution and 
phylogenetic relationships in mammals, and it suggests that 
the general structure of the mitochondrial genome is highly 
conserved among species of cetaceans and artiodactyls. 
Finally, evaluating these phylogenetic results requires 
comparison with biparentally inherited nuclear markers 
(i.e., introns). 